Abstract. - With the explosive growth of accessible information, especially on the Internet,
evaluation-based filtering has become a crucial task. Various systems have been devised aiming 
to sort through large volumes of information and select what is likely to be more relevant. In 
this letter we analyse a new ranking method, where the reputation of information providers is 
determined self-consistently. 



Introduction. - The study of complex networks and of some dynamical processes taking
place on these structures has recently attracted a great deal of attention in the physics
community [1-4]. The importance of technological networks, such as the Internet, lies mostly in
the increased communication capabilities [5,6], which make information progressively easier
to produce and distribute. As storage and transmission costs continue to drop, an
overabundance of information threatens to overwhelm its recipients. It is, therefore, crucial to
process information so as to present to a user only what best answers her requests [7].

An important aspect of information filtering regards scoring systems in the World Wide
Web [8,9]. They collect evaluations and aggregate them into published scores that are
meaningful to the final user. This embraces many different instances, ranging from commercial
websites, where buyers evaluate sellers (eBay, Amazon, etc.), to new generation search
engines (Google, Yahoo, etc.) and opinion websites, where people evaluate objects (Epinions,
Tailrank, etc.). Since the evaluators carry different expertise, it is important to estimate how
accurate a given vote may be and to weight it accordingly. This can be done through the use
of raters' reputations [10]. Reputation summarises one's past behaviour and has always been
used to mitigate the risk of interacting with strangers. The Internet, while enhancing such a risk,
also offers ways to counter it [11]. Since nobody knows a priori who the honest and competent
evaluators are, online scoring systems often include some measure
of their past performance. This gives users an indication of how trustworthy a given piece of
information is supposed to be. An expert in the field would probably obtain a high
reputation; experts' votes should then count more when aggregating the scores. While reputation is
usually obtained by asking users for supplementary evaluations of other users, the procedure
of Iterative Refinement (IR), which can be shown to outperform naive methods [12], does not
require raters to be rated explicitly.
The aim of this letter is to study, in a generalised model, the IR method's dependence
on the relevant parameters, illustrate the subtle issues in its mathematical underpinning and
elaborate on distortions generated by different kinds of cheating. Prior to describing the major
focus of this work, we will briefly state the model and define some notation.

Model and algorithm. - To describe our approach in the simplest manner, let us consider
N raters evaluating M objects, which can be books, movies or even other raters. Each object
l has an intrinsic quality $Q_l$ and each rater i has an intrinsic judging power $1/\sigma_i^2$. Let $x_{il}$ be
a random variable representing the rating given by rater i to object l. Intrinsic qualities and
judging powers are defined by the first two moments of its distribution:

$$ \langle x_{il} \rangle = \mu_{il} = Q_l + \Delta_{il} , \qquad (1) $$
$$ \langle (x_{il} - \mu_{il})^2 \rangle = \sigma_i^2 , \qquad (2) $$

where $\Delta_{il}$ is the systematic error of agent i towards object l. Expectation values are taken over
the distribution of $x_{il}$. They can be regarded as ensemble averages, obtained if the evaluations
were to be performed infinitely many times. Our aim is then to extract the quality of each
object from a single set $\{x_{il}\}$ of evaluations. We thus estimate the intrinsic quality $Q_l$ of
object l by a weighted average of the received votes

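$$ q_l = \sum_{i=1}^{N} f_i \, x_{il} ; \qquad (3) $$

the inverse judging power $\sigma_i^2$ of rater i is estimated by the sample variance $V_i$,

$$ V_i = \frac{1}{M} \sum_{l=1}^{M} (x_{il} - q_l)^2 . \qquad (4) $$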

The unnormalised weights $w_i$ take the general form

$$ w_i = V_i^{-\beta} , \qquad (5) $$

with $\beta > 0$ and $f_i = w_i / \sum_j w_j$. As such, $f_i$ decreases when $V_i$ increases because rater i has
a lower judging power and should be given less credit. We will consider scenarios where $\beta$
equals 1 or 1/2. The case $\beta = 1/2$, in fact, exhibits scale-changing and translational invariance
because $q_l$ becomes a sum of dimensionless random variables; the case $\beta = 1$ corresponds to
optimal weights, as explained later in the section No systematic errors.

The IR algorithm allows one to solve eqs. (3-5), thus estimating $Q_l$ and $\sigma_i$, via the following
recursive procedure: I) Without additional information, set $w_i = 1/N$ $\forall i = 1, 2, \ldots, N$. II)
Estimate $q_l$ with eq. (3). III) Estimate $V_i$ with eq. (4) and plug it in eq. (5) to find the
weights. IV) Repeat from step II. Numerical simulations show that this process converges to

the minimum of the cost function $E(\{q_l\}) = \sum_i \big[ \sum_l (x_{il} - q_l)^2 \big]^{1-\beta}$ much faster than other
conventional methods.
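The recursive procedure above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the stopping rule (`max_iter`, `tol`) and the small `floor` that protects against a vanishing variance are hypothetical choices, and the synthetic votes are Gaussian with no systematic error.

```python
import random


def simulate_ratings(Q, sigma, rng):
    """Draw x[i][l] = Q[l] + Gaussian noise with standard deviation sigma[i]."""
    return [[rng.gauss(Q[l], sigma[i]) for l in range(len(Q))]
            for i in range(len(sigma))]


def iterative_refinement(x, beta=1.0, max_iter=100, tol=1e-10, floor=1e-12):
    """Iterate eqs. (3)-(5) on a ratings table x[i][l] (N raters, M objects).

    Returns (q, V, f): estimated qualities, estimated rater variances and
    normalised weights. max_iter, tol and floor are illustrative choices.
    """
    n, m = len(x), len(x[0])
    f = [1.0 / n] * n                                   # step I: uniform weights
    q = [sum(f[i] * x[i][l] for i in range(n)) for l in range(m)]
    V = [0.0] * n
    for _ in range(max_iter):
        # step III: variance of each rater around the current estimates, eq. (4)
        V = [sum((x[i][l] - q[l]) ** 2 for l in range(m)) / m for i in range(n)]
        # eq. (5): unnormalised weights V_i^(-beta), then normalisation
        w = [max(v, floor) ** (-beta) for v in V]
        total = sum(w)
        f = [wi / total for wi in w]
        # step II: weighted average of the votes, eq. (3)
        q_new = [sum(f[i] * x[i][l] for i in range(n)) for l in range(m)]
        change = max(abs(a - b) for a, b in zip(q_new, q))
        q = q_new
        if change < tol:                                # step IV: stop when stable
            break
    return q, V, f


if __name__ == "__main__":
    rng = random.Random(0)
    Q = [rng.uniform(10, 20) for _ in range(100)]       # intrinsic qualities
    sigma = [rng.uniform(1, 5) for _ in range(50)]      # rater standard deviations
    x = simulate_ratings(Q, sigma, rng)
    q, V, f = iterative_refinement(x, beta=1.0)
    mse = sum((a - b) ** 2 for a, b in zip(q, Q)) / len(Q)
    print("mean squared error of q:", mse)
```

Passing `beta=0.5` selects the other weighting scheme considered in the text; the rest of the loop is unchanged.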



Analytical approach. - Eq. (2) implies that the random variable $(x_{il} - \mu_{il})^2$ has mean
$\sigma_i^2$ and variance $m_i^2 \sigma_i^4$, which is determined by the distribution of $x_{il}$; in particular, $m_i^2 = 2$
if the distribution of votes is itself Gaussian, since the fourth central moment is then $3\sigma_i^4$.
Let us define the variable $\gamma_{ij} = \frac{1}{M} \sum_l (x_{il} - Q_l)(x_{jl} - Q_l)$;
provided that $x_{il}$ has finite moments of, at least, order 4, in the large M limit
one obtains

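$$ \gamma_{ij} \to \sigma_i^2 \delta_{ij} + \overline{\Delta_i \Delta_j} + \frac{1}{\sqrt{M}} \left( \epsilon_{ij} + \overline{\Delta}_i h_j + \overline{\Delta}_j h_i \right) . \qquad (6) $$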
Here the overlined quantities represent averages over the M items, $\overline{X}_i = \frac{1}{M} \sum_l X_{il}$. The
Gaussian random variables $\epsilon_{ij}$ and $h_i$ have mean zero and variances $\mathrm{var}(\epsilon_{ij}) = m_{ij}^2 \sigma_i^2 \sigma_j^2$ and
$\mathrm{var}(h_i) = \sigma_i^2$, where $m_{ij}^2 = 1 + \delta_{ij}(m_i^2 - 1)$. In the following we shall use the notation $g_i = \epsilon_{ii}$.

Eq. (6) has to be interpreted in probability, as prescribed by the Central Limit Theorem.
In its derivation we have further assumed that raters are independent; in fact, the correlation
among the variables $\epsilon_{ij}$ of different indices diminishes as M increases. If $M \gg 1$, the random
variables $\{\epsilon_{ij}\}$ are effectively independent, with the first visible triangular correlation of order
$1/M^2$ or smaller. From counting the degrees of freedom associated with random numbers, it
is desirable to have $M > (N + 1)/2$. Eq. (6) forms the basis of our analytical pursuit in the
later development.
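As a rough numerical illustration of the large-M statement, the sketch below evaluates the empirical $\gamma_{ij}$ for two independent, unbiased raters ($\Delta_{il} = 0$). The helper names and the Gaussian vote distribution are assumptions made for the example only. The diagonal entries should approach $\sigma_i^2$ and the off-diagonal entry should approach zero, with deviations shrinking roughly like $1/\sqrt{M}$.

```python
import random


def gamma(x_i, x_j, Q):
    """Empirical gamma_ij = (1/M) * sum_l (x_il - Q_l) * (x_jl - Q_l)."""
    return sum((a - q) * (b - q) for a, b, q in zip(x_i, x_j, Q)) / len(Q)


def draw_votes(Q, sigma, rng):
    """Unbiased Gaussian votes of one rater with standard deviation sigma."""
    return [rng.gauss(q, sigma) for q in Q]


if __name__ == "__main__":
    rng = random.Random(0)
    for M in (100, 10000):
        Q = [rng.uniform(10, 20) for _ in range(M)]
        x1, x2 = draw_votes(Q, 2.0, rng), draw_votes(Q, 3.0, rng)
        # gamma_11 should be near 4, gamma_22 near 9 and gamma_12 near 0
        print(M, gamma(x1, x1, Q), gamma(x2, x2, Q), gamma(x1, x2, Q))
```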

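The performance of the IR method can be stated by measuring the following mean squared
errors:

$$ d_{q_l} = \langle (q_l - Q_l)^2 \rangle = \langle (s_l + \Delta_l)^2 \rangle , \qquad (7) $$
$$ d_{\sigma_i} = \langle (V_i - \sigma_i^2)^2 \rangle = \mathrm{Var}(V_i) + \mathrm{Bias}^2(V_i) , \qquad (8) $$

with $\mathrm{Bias}(V_i) = \langle V_i \rangle - \sigma_i^2$. In eq. (7) we have separated the systematic error part, making
use of the variables $\Delta_l = \sum_i f_i \Delta_{il}$ and $s_l = \sum_j (y_{jl} - Q_l) f_j$, with $y_{il} = x_{il} - \Delta_{il}$. Eqs. (1)-(2)
guarantee that the first two moments of $(y_{il} - Q_l)$ are independent of index l, therefore
$\langle s_l^2 \rangle = \frac{1}{M} \sum_{l=1}^{M} \langle s_l^2 \rangle$. This permits us to employ eq. (6) to obtain

$$ \langle s_l^2 \rangle = \Big\langle \sum_i f_i^2 \sigma_i^2 + \sum_{i,j} f_i f_j \frac{\epsilon_{ij}}{\sqrt{M}} \Big\rangle . \qquad (9) $$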

The variable $s_l$ becomes Gaussian in the large N limit, as long as the weights $f_i$ are fixed and
satisfy the Lindeberg condition [13]. However, such an inference cannot be drawn easily because
the weights and the estimated $q_l$ are entangled in eqs. (3-5). The standard deviation of $s_l$ can,
nevertheless, be calculated. The general problem of finding intrinsic values from completely
distorted votes is not solvable. In fact, even if one had an infinite number of raters and
evaluations, the estimator $q_l$ of $Q_l$ would always be biased by the amount $\langle \Delta_l \rangle$. We shall, in
the following, focus our attention on three particular cases of special interest.

No systematic errors. When $\Delta_{il} = 0$ $\forall i, l$, raters are impartial but possess different
judging powers. In order to obtain the best quality estimator one can minimise the mean
squared error $d_q(\{w_k\})$ of $q_l$ with respect to the $w_i$'s. This gives the optimal weights [14],
$\beta = 1$ in (5), with minimal $d_q(\{1/\sigma_k^2\}) = 1/\sum_i \sigma_i^{-2}$. Since the law of large numbers
guarantees the convergence of $d_q(\{1/\sigma_k\})$ to zero for large N, the same must obviously be
true for optimal weights. Unfortunately, it is not possible to state that the choice $\beta = 1$
is optimal if the $\sigma_i$'s are not known in advance. Although the convergence $q_l \to Q_l$ for
$N \to \infty$ is guaranteed, the small deviation $|q_l - Q_l|$ due to finite N will propagate to the
estimate of $\sigma_i^2$ and render $V_i \neq \sigma_i^2$, even when $M \to \infty$. A recursive procedure allows one to
calculate the expectation values $\langle f_i \rangle$; using eq. (6), it is straightforward to show that



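$$ V_i = \sigma_i^2 + \frac{g_i}{\sqrt{M}} - 2 \sum_j f_j \Big( \sigma_i^2 \delta_{ij} + \frac{\epsilon_{ij}}{\sqrt{M}} \Big) + \sum_{j,k} f_j f_k \Big( \sigma_j^2 \delta_{jk} + \frac{\epsilon_{jk}}{\sqrt{M}} \Big) . \qquad (10) $$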
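A minimal sketch of what these relations imply when the weights are treated as fixed numbers (an assumption that holds only to leading order, since in the IR method the weights fluctuate with the data): with unbiased raters the mean squared error of $q_l$ is $\sum_j f_j^2 \sigma_j^2$, and the expected offset of $V_i$ from $\sigma_i^2$ is $-2 f_i \sigma_i^2 + \sum_j f_j^2 \sigma_j^2$. The function names below are hypothetical.

```python
def normalised_weights(sigma, beta):
    """f_i proportional to sigma_i^(-2*beta): eq. (5) with V_i replaced by sigma_i^2."""
    w = [s ** (-2.0 * beta) for s in sigma]
    total = sum(w)
    return [wi / total for wi in w]


def quality_mse(f, sigma):
    """<(q_l - Q_l)^2> = sum_j f_j^2 sigma_j^2 for fixed weights, unbiased raters."""
    return sum(fj ** 2 * sj ** 2 for fj, sj in zip(f, sigma))


def variance_bias(i, f, sigma):
    """<V_i> - sigma_i^2 at fixed weights: -2 f_i sigma_i^2 + sum_j f_j^2 sigma_j^2."""
    return -2.0 * f[i] * sigma[i] ** 2 + quality_mse(f, sigma)


if __name__ == "__main__":
    sigma = [1.0, 2.0, 4.0, 5.0]
    for beta in (0.0, 0.5, 1.0):
        f = normalised_weights(sigma, beta)
        print(beta, quality_mse(f, sigma), [variance_bias(i, f, sigma) for i in range(4)])
```

With $\beta = 1$ the mean squared error reduces to $1/\sum_i \sigma_i^{-2}$ and the offset of every $V_i$ to $-1/\sum_j \sigma_j^{-2}$, so in this fixed-weight picture each rater's variance is underestimated by the same amount.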



Fig. 1 - Average squared difference $d_\sigma$ between given and predicted variance, as a function of M in
log-log scale. Symbols represent simulations of the IR method with $\beta = 1$ (triangles) and $\beta = 1/2$
(filled squares) in eq. (5); diamonds and filled circles show simulations of $d_\sigma$, where the estimator of
the variance has been corrected for the bias. The corresponding theoretical predictions, calculated as
explained in the text, fit the data very well. In the inset a similar plot shows the coincidence between
the predicted and simulated plateau reached by $d_q$ for large M. Parameters of the simulations:
N = 100, intrinsic values $Q_l$ uniformly distributed between 10 and 20 and standard deviations $\sigma_i$
uniformly distributed between 1 and 5; averaged over $10^3$ realizations.

Now we use $w_i = V_i^{-\beta}$ and, after iterative substitutions, we may express $w_i$ in terms of
$\sigma_j$'s and random variables $\{\epsilon_{ij}\}$. One may then compute $f_i$ and plug it in eqs. (7-9). Let
us define $G(b) = \frac{1}{N} \sum_i m_i^2 \sigma_i^b$ and denote by angular brackets a simple average over the
raters, $\langle y \rangle = \frac{1}{N} \sum_i y_i$. Equipped with this formalism, we perform tedious but straightforward
calculations to obtain the following asymptotic expansions, for $M, N \to \infty$, to the first two
dominating orders:

(11) 
(12) 

(13) 

with complicated constant coefficients (¹). These expressions simplify considerably in the
cases $\beta = 1/2$ and $\beta = 1$. For instance, eq. (12) takes the forms
$\mathrm{Bias}_{\beta=1}(V_i) \simeq -1/(N \langle \sigma^{-2} \rangle)$ and
$\mathrm{Bias}_{\beta=1/2}(V_i) \simeq 1/(N \langle 1/\sigma \rangle^2) - 2\sigma_i/(N \langle \sigma^{-1} \rangle)$. The analytical solution
allows one to find an unbiased estimator for $\sigma_i^2$ up to $O(1/N^2, 1/NM)$. In applications
we may use eq. (4) as an estimator of $\sigma^2$ to evaluate $\mathrm{Bias}(V_i)$ and redefine the weights as
$w_i = 1/(V_i - \mathrm{Bias}(V_i))$. Since we have here $d_{q_l} = \langle s_l^2 \rangle$, it suffices to plug eqs. (11)-(13) in eq. (9) to
find theoretical expressions for the mean squared errors. They are shown to match numerical
simulations in figs. 1 and 2.

(¹) The coefficients $C_1$, $C_2$, $V_0$, $U_1$ and $U_2$ are combinations of $\beta$, $m_i^2$, $G(\cdot)$ and rater
averages of powers of $\sigma$, such as $\langle \sigma^{-2\beta} \rangle$ and $\langle \sigma^{-4\beta} \rangle$.

Fig. 2 - Average squared difference between estimators and intrinsic values, for quality (main) and
variance (inset), plotted in log-log scale as a function of N, with M = 100, for $\beta = 1/2, 1$. Symbols
represent simulation results of the IR method, lines are the corresponding theoretical predictions.
The dotted line represents $d_q$ when the quality estimator is just the straight average.

In fig. 1 the mean squared error of the variance $d_\sigma = \frac{1}{N} \sum_i d_{\sigma_i}$ is plotted against M in
log-log scale. Our theoretical prediction becomes very good as soon as M > 10. Diamonds
and filled circles show simulation results of the IR method where the biased estimator of the
variance has been corrected by recursive use of eq. (12): the plateau reached by $d_\sigma$ for large
M disappears because the accuracy of the prediction can be thus improved by two orders of
magnitude. The mean squared error of the quality $d_q = \frac{1}{M} \sum_l d_{q_l}$, on the other hand, can
never vanish for large M when N is finite. This is shown in the inset of fig. 1, while the
dependence of $d_q$ on N is reported in fig. 2. We have also plotted therein, as a dotted line,
the behaviour of the same quantity when the estimator of $Q_l$ is just the average unweighted
vote received by item l. This illustrates how IR is able to reduce the error. A comparison
between the two weighting schemes shows that $w_i = 1/V_i$ performs almost always better than
$w_i = 1/\sqrt{V_i}$. The inset of fig. 2 shows $d_\sigma$ vs. N; the plateau, which is the same for $\beta = 1/2$
and 1, vanishes for $M \to \infty$ when corrected for the bias as before.
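The bias correction mentioned above can be sketched for the $\beta = 1$ scheme as follows. This is a single plug-in step under stated assumptions, not the recursive correction used for the figures: the leading-order bias $-1/(N \langle \sigma^{-2} \rangle) = -1/\sum_j \sigma_j^{-2}$ is evaluated with $V_j$ in place of $\sigma_j^2$, and the function name is hypothetical.

```python
def bias_corrected_weights(V):
    """Normalised weights w_i = 1 / (V_i - Bias(V_i)) for beta = 1.

    Bias(V_i) is approximated by -1 / sum_j (1 / V_j), i.e. the leading-order
    expression with the estimated variances used in place of sigma_j^2.
    """
    bias = -1.0 / sum(1.0 / v for v in V)
    w = [1.0 / (v - bias) for v in V]
    total = sum(w)
    return [wi / total for wi in w]


if __name__ == "__main__":
    V = [1.0, 4.0, 9.0]
    plain = [(1.0 / v) / sum(1.0 / u for u in V) for v in V]
    print("uncorrected:", plain)
    print("corrected:  ", bias_corrected_weights(V))
```

Because the bias is negative, the correction raises every variance estimate by the same amount, which slightly reduces the weight of the raters that appear most accurate.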

Camouflage. Let us now restart from the general problem of eqs. (1)-(2). The case
we want to analyse here is that of ratings affected by systematic errors that depend on the
rater but not on the ratee, $\Delta_{il} = \Delta_i$ $\forall l$. Such a fictitious distortion is instructive to study
analytically and can be easily generalised to more interesting cases. In fact, as it alters a
rater's scale of evaluation but not the ranking of her preferences, it can serve as a basis to
study systems where agents are only asked to sort a set of items in order of increasing quality.

If one knew the values of $\Delta_i$ for all i, one could find the optimal weights $\{w_i^*\}$ proceeding
as described in absence of systematic errors. Upon minimisation of $d_q(\{w_k\})$ with respect to
the $w_i$'s one obtains $w^* = A^{-1} \mathbf{1}$, with $A_{ij} = \sigma_i^2 \delta_{ij} + \Delta_i \Delta_j$. Here we have used a more compact
matrix notation, where $\mathbf{1}$ is a vector of ones.
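Since A is a diagonal matrix plus a rank-one term, the product $A^{-1} \mathbf{1}$ can be written in closed form with the Sherman-Morrison identity. The sketch below is an illustration under that observation, with hypothetical function names; it returns the weights normalised to sum to one.

```python
def camouflage_optimal_weights(sigma, delta):
    """Normalised solution of A w = 1 with A_ij = sigma_i^2 delta_ij + Delta_i Delta_j.

    Uses (D + u u^T)^-1 = D^-1 - D^-1 u u^T D^-1 / (1 + u^T D^-1 u),
    where D = diag(sigma_i^2) and u is the vector of systematic errors.
    """
    inv_var = [1.0 / s ** 2 for s in sigma]
    b = sum(d * iv for d, iv in zip(delta, inv_var))        # u^T D^-1 1
    c = sum(d * d * iv for d, iv in zip(delta, inv_var))    # u^T D^-1 u
    w = [iv - d * iv * b / (1.0 + c) for d, iv in zip(delta, inv_var)]
    total = sum(w)
    return [wi / total for wi in w]


def camouflage_mse(f, sigma, delta):
    """f^T A f = sum_i f_i^2 sigma_i^2 + (sum_i f_i Delta_i)^2."""
    noise = sum(fi ** 2 * s ** 2 for fi, s in zip(f, sigma))
    shift = sum(fi * d for fi, d in zip(f, delta))
    return noise + shift ** 2


if __name__ == "__main__":
    sigma, delta = [1.0, 2.0, 3.0], [0.5, -1.0, 2.0]
    f = camouflage_optimal_weights(sigma, delta)
    print(f, camouflage_mse(f, sigma, delta))
```

Note that some of these weights can be negative when the systematic errors are large, and that they reduce to $f_i \propto 1/\sigma_i^2$ when all $\Delta_i$ vanish.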

Whenever the deviations $\Delta_i$ are small, limited to a minority of the population or randomly
distributed around zero, they can be somehow detected. In the general case one can only
detect, at best, the relative systematic errors. In fact $\Delta = \sum_j f_j \Delta_j$ does not depend on
l in presence of camouflage and the relevant quantities only depend on $\Delta_i$ under the form
$\delta_i = \Delta_i - \Delta$. For instance, the variance can be written as
$V_i = \overline{\big[ \sum_j f_j (y_{jl} - y_{il}) \big]^2} + \delta_i^2$.
This means that, if we change the $\Delta_i$'s while keeping the $\delta_i$'s unchanged, we end up with the
same result for $d_q$, only translated by the amount $\Delta$.

Fig. 3 - Increase of object's quality as a function of the cheater's rank loss, as the value of $\Delta$ grows from
0 to 30. Simulations have been carried out with N = 100, M = 100 and intrinsic values distributed as
in fig. 1 except for $\sigma_I = 1$ and $Q_L = 20$. The theoretical estimations are parametric plots of eq. (14)
for $\beta = 1$ and 1/2.

In order to estimate analytically the performance of the IR method, we can posit $\Delta = 0$
and solve eqs. (3-5) as before. Thus we find $f_i(\{\delta_i\})$, whose term of order zero is
$(\sigma_i^2 + \delta_i^2)^{-\beta} / \sum_j (\sigma_j^2 + \delta_j^2)^{-\beta}$. This way we have a formal solution as a function of $\delta_i$, which must
comply with the constraint $\sum_i f_i \delta_i = 0$ and can eventually be recovered numerically.
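One possible way to recover this solution numerically is a plain fixed-point iteration on the order-zero expression, sketched below. This is an assumption about how the recovery could be done, not a procedure given in the text; the function name and the fixed number of iterations are hypothetical, and convergence is only expected when the systematic errors are moderate compared with the $\sigma_i$.

```python
def camouflage_weights(sigma, Delta, beta, n_iter=200):
    """Iterate f_i ~ (sigma_i^2 + delta_i^2)^(-beta), delta_i = Delta_i - sum_j f_j Delta_j."""
    n = len(sigma)
    f = [1.0 / n] * n
    for _ in range(n_iter):
        shift = sum(fj * dj for fj, dj in zip(f, Delta))   # weighted mean error
        delta = [d - shift for d in Delta]                 # relative errors
        w = [(s ** 2 + d ** 2) ** (-beta) for s, d in zip(sigma, delta)]
        total = sum(w)
        f = [wi / total for wi in w]
    shift = sum(fj * dj for fj, dj in zip(f, Delta))
    return f, [d - shift for d in Delta]


if __name__ == "__main__":
    sigma = [1.0, 2.0, 3.0, 2.0]
    Delta = [0.3, 0.0, -0.3, 0.2]
    f, delta = camouflage_weights(sigma, Delta, beta=1.0)
    print(f, delta, sum(fi * di for fi, di in zip(f, delta)))
```

Adding the same constant to every $\Delta_i$ leaves the returned weights unchanged, which reflects the statement that only the relative systematic errors matter.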

Cheating. - It is interesting to consider the case of one intentional cheater I wanting to
boost the value of object L by an amount $\Delta$, all other raters being honest: $\Delta_{il} = \delta_{iI} \delta_{lL} \Delta$.
Agent I commits no systematic error in evaluating all objects but L. Still, she would lose
credibility and weight as $\Delta$ becomes larger; this would eventually diminish her relative
influence over object L. It is important to evaluate the difference $\delta q_l = q_l(\Delta) - q_l$ between
the estimated value of the object with and without the friendly uprating. In fact a small $\delta q_L$,
compared to the loss in credibility of the rater, discourages cheating, and vice-versa.

The variance, as defined in eq. (4), can be written as a function of $\Delta$ and of the normalised
weights. Hence $V_i(\Delta, \{f_i(\Delta)\}) = V_i(0, \{f_i(\Delta)\}) + \delta_{iI} \Delta^2 / M$, where the formal expression of
$V_i(0, \{f_i(\Delta)\})$ is equal to that of eq. (10). Iterative asymptotic expansions can be performed
the same way we did in absence of systematic errors. In this case the variables $y_{il}$ are equal to
the $x_{il}$, except for $y_{IL} = x_{IL} - \Delta$. Therefore eq. (3) becomes $q_l(\Delta) = \sum_i f_i(\Delta) y_{il} + \Delta \delta_{lL} f_I(\Delta)$,
which implies $q_l(\Delta) - q_l \simeq \Delta \delta_{lL} f_I$. For $\Delta \ll \sqrt{M}$ the average deviation reads
$\langle \delta q_L \rangle \simeq \Delta \langle f_I \rangle$ to leading order, with a negative correction of relative order $\Delta^2 / M$. If the value
of $\Delta$ is comparable to $\sqrt{M}$, on the other hand, the zeroth
order of the correction at the thermodynamic limit amounts to



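$$ \langle \delta q_L \rangle \simeq \Delta \, \langle f_I \rangle \left( 1 + \frac{\Delta^2}{M \sigma_I^2} \right)^{-\beta} , \qquad (14) $$

where $\langle f_I \rangle$ is the cheater's average weight at $\Delta = 0$.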
In fig. 3 eq. (14) is shown to fit the simulations fairly well in the rank space. We have compared
thereby the scheme $\beta = 1$ (circles) with $\beta = 1/2$ (stars) in the worst case: the best agent is
trying to raise the worst object. In the region of moderate cheating the $w_i = 1/\sqrt{V_i}$ weighting
scheme is less sensitive to cheating. This is particularly important to the left of the x = y line, where
the cheater pays less than what she offers to the object and cheating can be advantageous.
However, the relative influence of the cheater is a growing, although saturating, function of $\Delta$.
Under the $w_i = 1/V_i$ weighting scheme, on the other hand, such an influence starts decreasing
once past a crossover value. There the cheater's reputation is so much damaged by her
misbehaviour that, if she attributed a higher value to object L, its estimated rank would
diminish. Optimal weights are, therefore, much more resilient to severe cheating.
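The contrast between the two schemes can be illustrated with a small sketch. It assumes, as stated above, that the cheater's variance estimate is inflated from $\sigma_I^2$ to $\sigma_I^2 + \Delta^2/M$, so that her unnormalised weight is multiplied by $(1 + \Delta^2/(M \sigma_I^2))^{-\beta}$, and it neglects the change in the normalisation of the weights (reasonable for large N). The function name is hypothetical and the result is expressed in units of her honest weight.

```python
def relative_influence(Delta, sigma_I, M, beta):
    """Delta * f_I(Delta) / f_I(0) under the large-N approximation described above."""
    return Delta * (1.0 + Delta ** 2 / (M * sigma_I ** 2)) ** (-beta)


if __name__ == "__main__":
    M, sigma_I = 100, 1.0
    for Delta in (1, 5, 10, 20, 30):
        print(Delta,
              relative_influence(Delta, sigma_I, M, 0.5),
              relative_influence(Delta, sigma_I, M, 1.0))
```

In this approximation the $\beta = 1/2$ influence grows monotonically and saturates at $\sigma_I \sqrt{M}$, while the $\beta = 1$ influence peaks at $\Delta = \sigma_I \sqrt{M}$ and decreases afterwards.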

We just remark that, taking averages without refinement, a cheater would indefinitely
increase an object's rank without undergoing any punishment. The transition to the cheater's
unfavorable region is the solution of $d_r(q) = d_r(\sigma)$ in the $\Delta$ space.

Conclusion. - In this letter we have analyzed a novel scoring system that aggregates the 
evaluations of N agents over M objects by use of reputation and weighted averages. Agents, 
as a result, are ranked according to their judging capability and objects according to their 
quality. The method can be implemented via an iterative algorithm, where the intrinsic bias 
of the estimators of the weights can be corrected. We show, with simulations and analytical 
results, that the method is effective and robust against abuses. The larger the system, the
better the filtering precision. This method can be applied in web-related reputation and
scoring systems. 

* * * 